{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# An RNN model to generate sequences\n",
    "RNN models can generate long sequences based on past data. This can be used to predict stock markets, temperatures, traffic or sales data based on past patterns. They can also be adapted to [generate text](https://docs.google.com/presentation/d/18MiZndRCOxB7g-TcCl2EZOElS5udVaCuxnGznLnmOlE/pub?slide=id.g139650d17f_0_1185). The quality of the prediction will depend on training data, network architecture, hyperparameters, the distance in time at which you are predicting and so on. But most importantly, it will depend on wether your training data contains examples of the behaviour you are trying to predict.\n",
    "\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "Things to do:<br/>\n",
    "<ol>\n",
    "<li> [Choose a waveform](#assignment1) then execute the entire notebook. See results at the bottom: not great...\n",
    "<li> Implement the RNN model and try again ([Assignment #2](#assignment2))\n",
    "<li> Check that your state is passed around correctly:\n",
    "    <ol>\n",
    "    <li> Did you use `dynamic_rnn(initial_state=Hin)` [in your model](#assignment3A) ?\n",
    "    <li> [During inference](#assignment3B): check the state (hint: it's OK)\n",
    "    <li> [In the training loop](#assignment3C) and also [when batching your data](#assignment3C): check the state (hint: 2 bugs)\n",
    "    </ol>\n",
    "<li> Make the predictions more robust ([Assignment #4](#assignment4)).\n",
    "</ol>\n",
    "    \n",
    "Play with these options until you get a good fit for at least 128 predicted samples. You can then try a [different waveform](#assignment1).\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "import numpy as np\n",
    "from matplotlib import pyplot as plt\n",
    "import utils_prettystyle\n",
    "import utils_batching\n",
    "import utils_display\n",
    "import tensorflow as tf\n",
    "import math\n",
    "print(\"Tensorflow version: \" + tf.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"assignment1\"></a>\n",
    "## Generate fake dataset\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "**Assignment #1**: Choose a waveform. Three possible choices on the next line: 0, 1 or 2\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "WAVEFORM_SELECT = 0 # select 0, 1 or 2\n",
    "\n",
    "def create_time_series(datalen):\n",
    "    # good waveforms\n",
    "    frequencies = [(0.2, 0.15), (0.35, 0.3), (0.6, 0.55)]\n",
    "    freq1, freq2 = frequencies[WAVEFORM_SELECT]\n",
    "    noise = [np.random.random()*0.1 for i in range(datalen)]\n",
    "    x1 = np.sin(np.arange(0,datalen) * freq1)  + noise\n",
    "    x2 = np.sin(np.arange(0,datalen) * freq2)  + noise\n",
    "    x = x1 + x2\n",
    "    return x.astype(np.float32)\n",
    "\n",
    "DATA_SEQ_LEN = 1024*128\n",
    "data = create_time_series(DATA_SEQ_LEN)\n",
    "plt.plot(data[:512])\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"hyperparameters\"></a>\n",
    "## Hyperparameters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "NB_EPOCHS = 5\n",
    "\n",
    "RNN_CELLSIZE = 80   # size of the RNN cells\n",
    "N_LAYERS = 1        # number of stacked RNN cells\n",
    "SEQLEN = 32         # unrolled sequence length\n",
    "BATCHSIZE = 32      # mini-batch size"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Visualize training sequences\n",
    "This is what the neural network will see during training."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# The function dumb_minibatch_sequencer splits the data into batches of sequences sequentially.\n",
    "for features, labels, epoch in utils_batching.dumb_minibatch_sequencer(data, BATCHSIZE, SEQLEN, nb_epochs=1):\n",
    "    break\n",
    "print(\"Features shape: \" + str(features.shape))\n",
    "print(\"Labels shape: \" + str(labels.shape))\n",
    "print(\"Excerpt from first batch:\")\n",
    "\n",
    "utils_display.picture_this_7(features)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"assignment2\"></a>\n",
    "<a name=\"model\"></a>\n",
    "## The model definition\n",
    "When executed, this function instantiates the Tensorflow graph for our model.\n",
    "\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "**Assignment #2**: implement a single-layer RNN using a GRU cell: `tf.nn.rnn_cell.GRUCell(RNN_CELLSIZE)`\n",
    "</div>\n",
    "\n",
    "<a name=\"assignment3A\"></a>\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "**Assignment #3.A**: check that state is passed around correctly: `dynamic_rnn(initial_state=Hin)`\n",
    "</div>\n",
    "\n",
    "<a name=\"assignment4\"></a>\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "**Assignment #4**: Make the predictions more robust. You can try the following:\n",
    "<ol><ol>\n",
    "    <li> Use a stacked RNN cell with 2 layers with in the model:<br/>\n",
    "```\n",
    "cells = [tf.nn.rnn_cell.GRUCell(RNN_CELLSIZE) for _ in range(N_LAYERS)]\n",
    "cell = tf.nn.rnn_cell.MultiRNNCell(cells, state_is_tuple=False)\n",
    "```\n",
    "        <br/>Do not forget to set N_LAYERS=2 in [hyperparameters](#hyperparameters) so that the input state gets initialized correctly.\n",
    "    </li>\n",
    "    <li> Regularisation: learning rate decay. Locate the line where the AdamOptimizer is instantiated and notice that this model uses an exponentially decaying learning rate. This allows you to train for longer. Set NB_EPOCHS=10 in [hyperparameters](#hyperparameters).</li>\n",
    "</ol></ol>\n",
    "Play with these options until you get a good fit for at least 128 predicted samples. You can then try a [different waveform](#assignment1). Dropout (another regularisation technique) could further improve this model but it is more interesting to try it out on real data: go to the next exercise for that.\n",
    "</div>\n",
    "\n",
    "![deep RNN schematic](../../../docs/images/RNN1.svg)\n",
    "<div style=\"text-align: right; font-family: monospace\">\n",
    "  X shape [BATCHSIZE, SEQLEN, 1]<br/>\n",
    "  Y shape [BATCHSIZE, SEQLEN, 1]<br/>\n",
    "  H shape [BATCHSIZE, RNN_CELLSIZE*NLAYERS]\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def model_fn(features, Hin, labels, step):\n",
    "    # inputs shapes during training (for inference, we will use BATCHSIZE=1 and SEQLEN=1):\n",
    "    # features [BATCHSIZE, SEQLEN, 1]\n",
    "    # labels [BATCHSIZE, SEQLEN, 1]\n",
    "    # Hin [BATCHSIZE, RNN_CELLSIZE*N_LAYERS]\n",
    "    X = features\n",
    "    batchsize = tf.shape(X)[0] # allow for variable batch size\n",
    "    seqlen = tf.shape(X)[1] # allow for variable sequence length\n",
    "    \n",
    "    # --- dummy model: please implement a real RNN model ---\n",
    "    Yn = X * tf.ones([RNN_CELLSIZE], name=\"dummy\") # Yn shape [BATCHSIZE, SEQLEN, RNN_CELLSIZE]\n",
    "    H = Hin\n",
    "    # TODO: create a tf.nn.rnn_cell.GRUCell\n",
    "    # TODO: unroll the cell using tf.nn.dynamic_rnn\n",
    "    # --- end of dummy model ---\n",
    "    \n",
    "    # This is the regression layer. It is already implemented.\n",
    "    # Yn [BATCHSIZE, SEQLEN, RNN_CELLSIZE]\n",
    "    Yn = tf.reshape(Yn, [batchsize*seqlen, RNN_CELLSIZE])\n",
    "    Yr = tf.layers.dense(Yn, 1) # Yr [BATCHSIZE*SEQLEN, 1] predicting vectors of 1 element\n",
    "    Yr = tf.reshape(Yr, [batchsize, seqlen, 1]) # Yr [BATCHSIZE, SEQLEN, 1]\n",
    "    \n",
    "    # Yr[BATCHSIZE, SEQLEN, 1]\n",
    "    Yout = Yr[:,-1,:] # Last output in sequence Yout [BATCHSIZE, 1]\n",
    "    \n",
    "    loss = tf.losses.mean_squared_error(Yr, labels) # shapes Yr[BATCHSIZE, SEQLEN, 1], labels[BATCHSIZE, SEQLEN, 1]\n",
    "    lr = 0.001 + tf.train.exponential_decay(0.01, step, 400, 0.5) # 0.001+0.01*0.5^(step/400)\n",
    "    optimizer = tf.train.AdamOptimizer(learning_rate=lr)\n",
    "    train_op = optimizer.minimize(loss)\n",
    "    \n",
    "    return Yout, H, loss, train_op # output shapes Yout[BATCHSIZE, 1], H[BATCHSIZE, RNN_CELLSIZE*N_LAYERS]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Instantiate the model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "tf.reset_default_graph() # restart model graph from scratch\n",
    "\n",
    "# placeholder for inputs\n",
    "Hin = tf.placeholder(tf.float32, [None, RNN_CELLSIZE * N_LAYERS])\n",
    "features = tf.placeholder(tf.float32, [None, None, 1]) # [BATCHSIZE, SEQLEN, 1]\n",
    "labels = tf.placeholder(tf.float32, [None, None, 1]) # [BATCHSIZE, SEQLEN, 1]\n",
    "step = tf.placeholder(tf.int32)\n",
    "\n",
    "# instantiate the model\n",
    "Yout, H, loss, train_op = model_fn(features, Hin, labels, step)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"assignment3B\"></a>\n",
    "<a name=\"inference\"></a>\n",
    "## Inference\n",
    "This is a generative model: run one trained RNN cell in a loop\n",
    "\n",
    "\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "**Assignment #3.B**: Check that the RNN state is passed around correctly (hint: it's OK)\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def prediction_run(prime_data, run_length):\n",
    "    H_ = np.zeros([1, RNN_CELLSIZE * N_LAYERS]) # zero state initially\n",
    "    Yout_ = np.zeros([1, 1])\n",
    "    data_len = prime_data.shape[0]\n",
    "\n",
    "    # prime the state from data\n",
    "    if data_len > 0:\n",
    "        Yin = np.array(prime_data)\n",
    "        Yin = np.reshape(Yin, [1, data_len, 1]) # reshape as one sequence of length data_len\n",
    "        feed = {Hin: H_, features: Yin}\n",
    "        Yout_, H_ = sess.run([Yout, H], feed_dict=feed)\n",
    "    \n",
    "    # run prediction\n",
    "    # To generate a sequence, run a trained cell in a loop passing as input and input state\n",
    "    # respectively the output and output state from the previous iteration.\n",
    "    results = []\n",
    "    for i in range(run_length):\n",
    "        Yout_ = np.reshape(Yout_, [1, 1, 1]) # batch of a single sequence of a single vector with one element\n",
    "        feed = {Hin: H_, features: Yout_}\n",
    "        Yout_, H_ = sess.run([Yout, H], feed_dict=feed)\n",
    "        results.append(Yout_[0,0])\n",
    "        \n",
    "    return np.array(results)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Initialize Tensorflow session\n",
    "This resets all neuron weights and biases to initial random values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# first input state\n",
    "Hzero = np.zeros([BATCHSIZE, RNN_CELLSIZE * N_LAYERS])\n",
    "# variable initialization\n",
    "sess = tf.Session()\n",
    "init = tf.global_variables_initializer()\n",
    "sess.run([init])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"assignment3C\"></a>\n",
    "<a name=\"training\"></a>\n",
    "## The training loop\n",
    "You can re-execute this cell to continue training\n",
    "\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "**Assignment #3.C**: find and resolve RNN state bugs.<br?>\n",
    "**hint**: there are 2 bugs. One in the core of the training loop. Find the state and pass it around correctly. The second bug is in the way the data was sliced into batches of sequences. Special care is needed when batching sequences for an RNN. [See this slide](https://docs.google.com/presentation/d/18MiZndRCOxB7g-TcCl2EZOElS5udVaCuxnGznLnmOlE/pub?slide=id.g139650d17f_0_584) to understand the situation. You can fix it by using `rnn_minibatch_sequencer` instead of `dumb_minibatch_sequencer`)\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "H_ = Hzero\n",
    "losses = []\n",
    "indices = []\n",
    "for i, (next_features, next_labels, epoch) in enumerate(utils_batching.dumb_minibatch_sequencer(data, BATCHSIZE, SEQLEN, nb_epochs=NB_EPOCHS)):\n",
    "    next_features = np.expand_dims(next_features, axis=2) # model wants 3D inputs [BATCHSIZE, SEQLEN, 1] \n",
    "    next_labels = np.expand_dims(next_labels, axis=2)\n",
    "\n",
    "    feed = {Hin: Hzero, features: next_features, labels: next_labels, dropout_pkeep: DROPOUT_PKEEP}\n",
    "    Yout_, _, loss_, _ = sess.run([Yout, H, loss, train_op], feed_dict=feed)\n",
    "    # print progress\n",
    "    if i%100 == 0:\n",
    "        print(\"epoch \" + str(epoch) + \", batch \" + str(i) + \", loss=\" + str(np.mean(loss_)))\n",
    "    if i%10 == 0:\n",
    "        losses.append(np.mean(loss_))\n",
    "        indices.append(i)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.ylim(ymax=np.amax(losses[1:])) # ignore first value for scaling\n",
    "plt.plot(indices, losses)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "PRIMELEN=256\n",
    "RUNLEN=512\n",
    "OFFSET=20\n",
    "RMSELEN=128\n",
    "\n",
    "prime_data = data[OFFSET:OFFSET+PRIMELEN]\n",
    "results = prediction_run(prime_data, RUNLEN)\n",
    "\n",
    "utils_display.picture_this_8(data, prime_data, results, OFFSET, PRIMELEN, RUNLEN, RMSELEN)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Copyright 2018 Google LLC\n",
    "\n",
    "Licensed under the Apache License, Version 2.0 (the \"License\");\n",
    "you may not use this file except in compliance with the License.\n",
    "You may obtain a copy of the License at\n",
    "[http://www.apache.org/licenses/LICENSE-2.0](http://www.apache.org/licenses/LICENSE-2.0)\n",
    "Unless required by applicable law or agreed to in writing, software\n",
    "distributed under the License is distributed on an \"AS IS\" BASIS,\n",
    "WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
    "See the License for the specific language governing permissions and\n",
    "limitations under the License."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
